
Reviews: Data-Dependence of Plateau Phenomenon in Learning with Neural Network --- Statistical Mechanical Analysis

Neural Information Processing Systems

It would make more sense to show results for data with low-dimensional structure, in which the first one or two eigenvalues are non-zero and the rest are zero or epsilon-small. Do the conclusions for the two-eigenvalue case still hold in this example? It is hard for me to see what I should learn from Figures 5 and 6.
- The dependence of the learning dynamics on the spectral properties of the input data is not new; it was previously studied by Saxe et al. (arXiv, 2013) for simple linear networks. It would be appropriate to mention or discuss those results in the text. It has previously been shown that the initial conditions have a big impact on the trainability and learning dynamics of these networks; in this case, they would be defined as the initial conditions on the order parameters Q, R, and D.
- The analysis here seems tractable only for networks with a small number of hidden units.


Reviews: Data-Dependence of Plateau Phenomenon in Learning with Neural Network --- Statistical Mechanical Analysis

Neural Information Processing Systems

This paper analyzes the dynamics of online learning in two-layer neural networks under the teacher-student scenario. The analysis extends that of Saad and Solla (1995) by allowing an input covariance matrix that is not proportional to the identity. The main contribution is the finding that the plateau phenomenon observed in the learning dynamics of nonlinear neural networks depends on the statistics of the input data. The three reviewers rated this paper above the acceptance threshold, citing the originality and importance of its contribution. At the same time, two reviewers raised concerns about the clarity of the presentation.



Data-Dependence of Plateau Phenomenon in Learning with Neural Network --- Statistical Mechanical Analysis

Yoshida, Yuki, Okada, Masato

Neural Information Processing Systems

The plateau phenomenon, wherein the loss value stops decreasing during learning, has been reported by many researchers. The phenomenon was actively investigated in the 1990s and found to arise from the fundamental hierarchical structure of neural network models; it has since been regarded as inevitable. However, the phenomenon seldom occurs in the context of recent deep learning, leaving a gap between theory and reality. In this paper, using a statistical mechanical formulation, we clarify the relationship between the plateau phenomenon and the statistical properties of the learned data.
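The teacher-student setup underlying this line of analysis can be illustrated with a minimal numerical sketch. The code below is not the paper's method: it uses isotropic Gaussian inputs (identity covariance, the Saad-Solla baseline that the paper generalizes), a soft-committee student (second-layer weights fixed to one), and arbitrarily chosen sizes and learning rate. With a small, nearly symmetric initialization, the online SGD loss typically drops quickly and then lingers on a plateau before (eventually) escaping.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 500        # input dimension (illustrative choice)
M = 2          # hidden units in both teacher and student
eta = 0.5      # learning rate (illustrative choice)
steps = 20000

g = np.tanh
def g_prime(x):
    return 1.0 - np.tanh(x) ** 2

# Teacher rows with unit norm scale: pre-activations are O(1).
B = rng.standard_normal((M, N)) / np.sqrt(N)
# Student starts small and nearly symmetric -- the regime where plateaus appear.
J = rng.standard_normal((M, N)) * 1e-3 / np.sqrt(N)

losses = []
for _ in range(steps):
    x = rng.standard_normal(N)   # isotropic input; the paper studies general covariance
    y = g(B @ x).sum()           # teacher output
    h = J @ x                    # student pre-activations
    err = g(h).sum() - y
    # Online SGD on the squared error, with the 1/N scaling used in the
    # statistical-mechanics formulation so that time t/N stays O(1).
    J -= (eta / N) * err * g_prime(h)[:, None] * x[None, :]
    losses.append(0.5 * err ** 2)

# Smooth the noisy one-sample losses to make the plateau visible.
window = 1000
smoothed = np.convolve(losses, np.ones(window) / window, mode="valid")
print(f"early loss {smoothed[0]:.4f}, late loss {smoothed[-1]:.4f}")
```

Plotting `smoothed` against step count (or against the rescaled time t/N) is the usual way to see the plateau; its length and even its presence depend on the initialization of the order parameters and, as the paper argues, on the input statistics.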